skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Jin, Yufei"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Graph Neural Networks (GNNs) have shown superb performance in handling networked data, mainly attributed to their message passing and convolution process across neighbors. For most literature, the performance of GNNs is mainly reported based on noise-free data environments. No study has systematically evaluated GNNs’ performance under noise. In this article, we carry out an empirical study and theoretical analysis of four types of GNNs, including Graph Convolutional Networks (GCNs), Graph Attention Networks (GATs), Graph Contrastive Networks (GCL), and graph UniFilter under three types of noise, including attribute noise, structure noise, and label noise. Our study shows that GNNs behave tremendously differently in response to different types of noise. Overall, GAT is the most noise vulnerable and sensitive, whereas GCL is the most noise resilient. We further carry out theoretical analysis to explain the reason causing GAT to be sensitive to noise, and propose a solution to enhance its noise resilience. Our study brings in-depth firsthand knowledge of GNNs under noise for researchers and practitioners to better utilize GNNs in real-world applications. 
    more » « less
    Free, publicly-accessible full text available July 31, 2026
  2. Free, publicly-accessible full text available July 17, 2026
  3. Free, publicly-accessible full text available August 16, 2026
  4. Free, publicly-accessible full text available December 9, 2025
  5. Globersons, Amir; Mackey, Lester; Belgrave, Danielle; Fan, Angela; Paquet, Ulrich; Tomczak, Jakub M; Zhang, Cheng (Ed.)
    Free, publicly-accessible full text available December 10, 2025
  6. This paper presents HGEN that pioneers ensemble learning for heterogeneous graphs. We argue that the heterogeneity in node types, nodal features, and local neighborhood topology poses significant challenges for ensemble learning, particularly in accommodating diverse graph learners. Our HGEN framework ensembles multiple learners through a meta-path and transformation-based optimization pipeline to uplift classification accuracy. Specifically, HGEN uses meta-path combined with random dropping to create Allele Graph Neural Networks (GNNs), whereby the base graph learners are trained and aligned for later ensembling. To ensure effective ensemble learning, HGEN presents two key components:1) a residual-attention mechanism to calibrate allele GNNs of different meta-paths, thereby enforcing node embeddings to focus on more informative graphs to improve base learner accuracy, and 2) a correlation-regularization term to enlarge the disparity among embedding matrices generated from different meta-paths, thereby enriching base learner diversity. We analyze the convergence of HGEN and attest its higher regularization magnitude over simple voting. Experiments on five heterogeneous networks validate that HGEN consistently outperforms its state-of-the-art competitors by substantial margin. Codes are available at https://github.com/Chrisshen12/HGEN. 
    more » « less
    Free, publicly-accessible full text available September 1, 2026
  7. Label Distribution Learning (LDL) has been extensively studied in IID data applications such as computer vision, thanks to its more generic setting over single-label and multi-label classification. This paper advances LDL into graph domains and aims to tackle a novel and fundamental heterogeneous graph label distribution learning (HGDL) problem. We argue that the graph heterogeneity reflected on node types, node attributes, and neighborhood structures can impose significant challenges for generalizing LDL onto graphs. To address the challenges, we propose a new learning framework with two key components: 1) proactive graph topology homogenization, and 2) topology and content consistency-aware graph transformer. Specifically, the former learns optimal information aggregation between meta-paths, so that the node heterogeneity can be proactively addressed prior to the succeeding embedding learning; the latter leverages an attention mechanism to learn consistency between meta-path and node attributes, allowing network topology and nodal attributes to be equally emphasized during the label distribution learning. By using KL-divergence and additional constraints, HGDL delivers an end-to-end solution for learning and predicting label distribution for nodes. Both theoretical and empirical studies substantiate the effectiveness of our HGDL approach. 
    more » « less
  8. Label Distribution Learning (LDL), as a more general learning setting than generic single-label and multi-label learning, has been commonly used in computer vision and many other applications. To date, existing LDL approaches are designed and applied to data without considering the interdependence between instances. In this paper, we propose a Graph Label Distribution Learning (GLDL) framework, which explicitly models three types of relationships: instance-instance, label-label, and instance-label, to learn the label distribution for networked data. A label-label network is learned to capture label-to-label correlation, through which GLDL can accurately learn label distributions for nodes. Dual graph convolution network (GCN) Co-training with heterogeneous message passing ensures two GCNs, one focusing on instance-instance relationship and the other one targeting label-label correlation, are jointly trained such that instance-instance relationship can help induce label-label correlation and vice versa. Our theoretical study derives the error bound of GLDL. For verification, four benchmark datasets with label distributions for nodes are created using common graph benchmarks. The experiments show that considering dependency helps learn better label distributions for networked data, compared to state-of-the-art LDL baseline. In addition, GLDL not only outperforms simple GCN and graph attention networks (GAT) using distribution loss but is also superior to its variant considering label-label relationship as a static network. GLDL and its benchmarks are the first research endeavors to address LDL for graphs. Code and benchmark data are released for public access. 
    more » « less
  9. Wooldridge, Michael; Dy, Jennifer; Natarajan, Sriraam (Ed.)